Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve range: refactor, support start as an optional kwarg, clearer docs and error messages #38041

Merged
merged 20 commits into from
Jan 12, 2021

Conversation

jw3126
Copy link
Contributor

@jw3126 jw3126 commented Oct 15, 2020

Mathematically a range is uniquely determined by three out of four of start, step, stop, length.
Furthermore if one assumes step=1 any combination of two others macthematically suffices to specify a range.

With this PR the range function reflects this. Any combination of three (two non step) arguments will be accepted. For instance now you can do:

julia> range(stop=10, step=1, length=2)
9:1:10

On master only a somewhat random subset of three (two non step) arguments will be accepted.

PR Features

  • Non breaking
  • More ergonomic
  • Easier docs

test/ranges.jl Outdated Show resolved Hide resolved
test/ranges.jl Outdated Show resolved Hide resolved
@jw3126
Copy link
Contributor Author

jw3126 commented Oct 20, 2020

@johnnychen94 do you want to review this?

base/range.jl Outdated Show resolved Hide resolved
@johnnychen94
Copy link
Member

Despite the flexibility this PR brings, which I think is a good change, I'm more concerned about the complexity. #37875 is trying to make the docs clearer, and I'm worried that this PR makes it even harder to clarify the usage.

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
@jw3126
Copy link
Contributor Author

jw3126 commented Oct 20, 2020

Oh I was not aware of #37875. I don't think this implementation is more complex than the implementation on master. And I think the semantics of this PR are less complex. IMO with this PR it is much clearer, which signatures or range are supported and when a UnitRange is returned.
So I think docs of this PR will be less complex then correct docs of master.

@jw3126
Copy link
Contributor Author

jw3126 commented Nov 1, 2020

I think CI fail is unrelated.

@jw3126
Copy link
Contributor Author

jw3126 commented Nov 3, 2020

Can this be reviewed and merged?

@jw3126
Copy link
Contributor Author

jw3126 commented Nov 6, 2020

@mcabbott can you review and/or suggest other reviewers?

Copy link
Member

@johnnychen94 johnnychen94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A second read, I think this change, in general, is better than #37875 since this provides a more versatile range API.

I believe it can be much more convincing if you could summary the changes with a table about what usage is new, and what return value has changed(if there is). By doing this people with merge permission can easily review the changes; the hard part of this PR is that code logic change is clear, but return value changes are not.

For example, the first line of the summary might be:

usage master this PR
range(stop=10, step=1, length=2) MethodError UnitRange(9:10)

base/range.jl Outdated Show resolved Hide resolved
Co-authored-by: Johnny Chen <johnnychen94@hotmail.com>
@jw3126
Copy link
Contributor Author

jw3126 commented Nov 6, 2020

I believe it can be much more convincing if you could summary the changes with a table about what usage is new, and what return value has changed(if there is). By doing this people with merge permission can easily review the changes; the hard part of this PR is that code logic change is clear, but return value changes are not.

That is a good suggestion. There are no changes to return values. There were initially, but I decided to aim for 100% backwards compatibility, to avoid discussions. All this PR does is add a bunch of methods to range.

julia> range(start=1, stop=10, length=10) # specify all arguments as keywords
1.0:1.0:10.0

julia> range(length=5, stop=10, step=1) # start can be computed now
6:1:10

julia> range(start=1, step=2, length=3) # any combination of three arguments works
1:2:5

julia> range(start=1, stop=10) # unit range will be returned for integer arguments and step unspecified
1:10

julia> range(start=1, length=3) # any combination of two arguments other then step works.
1:3

@johnnychen94
Copy link
Member

@StefanKarpinski I think this one is a nice refactor, and is good for review and merge, would you mind to take a look at it?

base/range.jl Outdated Show resolved Hide resolved
Co-authored-by: Johnny Chen <johnnychen94@hotmail.com>
@jw3126
Copy link
Contributor Author

jw3126 commented Nov 20, 2020

Bump can this be merged?

@timholy timholy added the triage This should be discussed on a triage call label Nov 20, 2020
@timholy
Copy link
Member

timholy commented Nov 20, 2020

I've put a triage tag on this to encourage discussion. I've not reviewed this carefully myself, but to me it seems that API changes to such a core function deserve discussion before it would be appropriate to merge this.

@mkitti
Copy link
Contributor

mkitti commented Dec 16, 2020

I like the clarity that the all keyword option gives, but I also feel like it is a lot to type. It would be great if we could unambiguously abbreviate them somehow while also being clear. I proposed a solution over in #38750 to provide explicit names for a position only interface. To summarize, if we map start to begin and stop to end, then the letters b, e, s, and l refer uniquely to begin, end, step, and length which are all existing keywords or functions elsewhere in Julia.

Current New Example New Abbreviation
start begin a[begin] b
stop end a[end] e
length length length(a) l
step step step(a) s

For a positional interface, would then have functions named range_{bels} where {bels} is some combination of three of the letters: range_bel, range_bes, range_bls, and range_els.

If we accept my one-letter mappings, would it make sense to also offer them as an alternate keyword interface in the context of this PR?

@antoine-levitt
Copy link
Contributor

Doing it like that feels pretty inconsistent with julia APIs, where multiple dispatch is usually used to have one big function with many methods rather than many functions with variants spelled in the name. Regarding the other points, I'll reply in the other thread.

@jw3126
Copy link
Contributor Author

jw3126 commented Dec 17, 2020

If we accept my one-letter mappings, would it make sense to also offer them as an alternate keyword interface in the context of this PR?

I would prefer not to have this in this PR. It can be a separate PR if we decide to do it.

@mkitti
Copy link
Contributor

mkitti commented Jan 7, 2021

One of the reasons I suggested

"If both arguments are Integers a UnitRange will be returned"

is this statement only applies when only two arguments are given. It seems strange to refer to two arguments as "all" arguments.

If three arguments are given, one does not get a unit range even when stop - start + 1 == length or step = 1

julia> range_start_stop_length(1, 3, 3)
3-element LinRange{Float64}:
 1.0,2.0,3.0

julia> range_start_step_length(1, 1, 3)
1:1:3

julia> range_start_step_stop(1, 1, 3)
1:1:3

Comment on lines +131 to +146
_range(start::Nothing, step::Nothing, stop::Nothing, len::Nothing) = range_error(start, step, stop, len)
_range(start::Nothing, step::Nothing, stop::Nothing, len::Any ) = range_error(start, step, stop, len)
_range(start::Nothing, step::Nothing, stop::Any , len::Nothing) = range_error(start, step, stop, len)
_range(start::Nothing, step::Nothing, stop::Any , len::Any ) = range_stop_length(stop, len)
_range(start::Nothing, step::Any , stop::Nothing, len::Nothing) = range_error(start, step, stop, len)
_range(start::Nothing, step::Any , stop::Nothing, len::Any ) = range_error(start, step, stop, len)
_range(start::Nothing, step::Any , stop::Any , len::Nothing) = range_error(start, step, stop, len)
_range(start::Nothing, step::Any , stop::Any , len::Any ) = range_step_stop_length(step, stop, len)
_range(start::Any , step::Nothing, stop::Nothing, len::Nothing) = range_error(start, step, stop, len)
_range(start::Any , step::Nothing, stop::Nothing, len::Any ) = range_start_length(start, len)
_range(start::Any , step::Nothing, stop::Any , len::Nothing) = range_start_stop(start, stop)
_range(start::Any , step::Nothing, stop::Any , len::Any ) = range_start_stop_length(start, stop, len)
_range(start::Any , step::Any , stop::Nothing, len::Nothing) = range_error(start, step, stop, len)
_range(start::Any , step::Any , stop::Nothing, len::Any ) = range_start_step_length(start, step, len)
_range(start::Any , step::Any , stop::Any , len::Nothing) = range_start_step_stop(start, step, stop)
_range(start::Any , step::Any , stop::Any , len::Any ) = range_error(start, step, stop, len)
Copy link
Member

@simeonschaub simeonschaub Jan 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use @eval in a for loop here? This just seems a bit hard to maintain.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it looks cool 😛

Copy link
Contributor

@mkitti mkitti Jan 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could use some comments. I think @eval would make it more complicated. There are four options and 4^2 == 16 lines. 16 lines with a non-trivial mapping seems pretty manageable to me. We need to make it easier to read, not less.

How about something like this:

# range( )
_range(start::Nothing, step::Nothing, stop::Nothing, len::Nothing) = range_error(start, step, stop, len)
# range(; length )
_range(start::Nothing, step::Nothing, stop::Nothing, len::Any    ) = range_error(start, step, stop, len)
# range(; stop )
_range(start::Nothing, step::Nothing, stop::Any    , len::Nothing) = range_error(start, step, stop, len)
# range(; stop, length )
_range(start::Nothing, step::Nothing, stop::Any    , len::Any    ) = range_stop_length(stop, len)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks very self-documenting to my eyes with a clear pattern.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is navigable to me, but I need to parse it to find the correct line. I'm happy with how it is though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put some thought into how to write down this dispatch mechanic. I also considered using @eval but decided the above table is the most readable and easy to maintain.

@JeffBezanson
Copy link
Member

Triage is in favor.

@JeffBezanson
Copy link
Member

Maybe the doc string doesn't need to mention when a UnitRange is returned? Does it really matter?

@StefanKarpinski
Copy link
Member

Triage approves of this PR, which wisely addresses only the non-controversial improvements to range 👍🏻

@mbauman mbauman removed the triage This should be discussed on a triage call label Jan 7, 2021
@mkitti
Copy link
Contributor

mkitti commented Jan 7, 2021

Maybe the doc string doesn't need to mention when a UnitRange is returned? Does it really matter?

There appear to be several non-trivial performance specializations based on having either AbstractUnitRange or UnitRange. It seems to matter. If they could be created using easy to use syntax, that would be appreciated.

However, the conclusion we reached in #37875 is this detail could be relegated to the Extended Help session.

julia> methodswith(UnitRange)
[1] firstindex(::UnitRange) in Base at range.jl:551
[2] float(r::UnitRange) in Base at float.jl:921
[3] getindex(t::Tuple, r::UnitRange) in Base at range.jl:293
[4] map(::Type{T}, r::UnitRange) where T<:Real in Base at abstractarray.jl:995
[5] show(io::IO, r::UnitRange) in Base at range.jl:742
[6] eigen(A::LinearAlgebra.SymTridiagonal{T,V} where V<:AbstractArray{T,1}, irange::UnitRange) where T in LinearAlgebra at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\LinearAlgebra\src\tridiag.jl:284
[7] eigen(A::Union{LinearAlgebra.Hermitian{Complex{T},S}, LinearAlgebra.Hermitian{T,S}, LinearAlgebra.Symmetric{T,S}} where S where T<:Real, irange::UnitRange) in LinearAlgebra at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\LinearAlgebra\src\symmetric.jl:699
[8] eigen!(A::LinearAlgebra.SymTridiagonal{var"#s826",V} where V<:AbstractArray{var"#s826",1} where var"#s826"<:Union{Float32, Float64}, irange::UnitRange) in LinearAlgebra at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\LinearAlgebra\src\tridiag.jl:282
[9] eigen!(A::Union{LinearAlgebra.Hermitian{Complex{var"#s826"},var"#s825"}, LinearAlgebra.Hermitian{var"#s826",var"#s825"}, LinearAlgebra.Symmetric{var"#s826",var"#s825"}} where var"#s825"<:(StridedArray{T, 2} where T) where var"#s826"<:Union{Float32, Float64}, irange::UnitRange) in LinearAlgebra at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\LinearAlgebra\src\symmetric.jl:680
[10] eigvals(A::LinearAlgebra.SymTridiagonal{T,V} where V<:AbstractArray{T,1}, irange::UnitRange) where T in LinearAlgebra at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\LinearAlgebra\src\tridiag.jl:297
[11] eigvals(A::Union{LinearAlgebra.Hermitian{Complex{T},S}, LinearAlgebra.Hermitian{T,S}, LinearAlgebra.Symmetric{T,S}} where S where T<:Real, irange::UnitRange) in LinearAlgebra at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\LinearAlgebra\src\symmetric.jl:775
[12] eigvals!(A::LinearAlgebra.SymTridiagonal{var"#s826",V} where V<:AbstractArray{var"#s826",1} where var"#s826"<:Union{Float32, Float64}, irange::UnitRange) in LinearAlgebra at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\LinearAlgebra\src\tridiag.jl:295
[13] eigvals!(A::Union{LinearAlgebra.Hermitian{Complex{var"#s826"},var"#s825"}, LinearAlgebra.Hermitian{var"#s826",var"#s825"}, LinearAlgebra.Symmetric{var"#s826",var"#s825"}} where var"#s825"<:(StridedArray{T, 2} where T) where var"#s826"<:Union{Float32, Float64}, irange::UnitRange) in LinearAlgebra at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\LinearAlgebra\src\symmetric.jl:746

julia> methodswith(AbstractUnitRange)
[1] checkindex(::Type{Bool}, inds::AbstractUnitRange, i::Real) in Base at abstractarray.jl:563
[2] checkindex(::Type{Bool}, inds::AbstractUnitRange, ::Colon) in Base at abstractarray.jl:564
[3] checkindex(::Type{Bool}, inds::AbstractUnitRange, ::Base.Slice) in Base at abstractarray.jl:565
[4] checkindex(::Type{Bool}, inds::AbstractUnitRange, r::AbstractRange) in Base at abstractarray.jl:566
[5] checkindex(::Type{Bool}, indx::AbstractUnitRange, I::Base.LogicalIndex) in Base at multidimensional.jl:695
[6] checkindex(::Type{Bool}, indx::AbstractUnitRange, I::AbstractArray{Bool,1}) in Base at abstractarray.jl:570
[7] checkindex(::Type{Bool}, indx::AbstractUnitRange, I::AbstractArray{Bool,N} where N) in Base at abstractarray.jl:571
[8] checkindex(::Type{Bool}, inds::AbstractUnitRange, I::AbstractArray) in Base at abstractarray.jl:572
[9] checkindex(::Type{Bool}, inds::AbstractUnitRange, i) in Base at abstractarray.jl:561
[10] findfirst(p::Union{Base.Fix2{typeof(==),T}, Base.Fix2{typeof(isequal),T}}, r::AbstractUnitRange) where T<:Integer in Base at array.jl:1867
[11] getindex(r::AbstractUnitRange, s::AbstractUnitRange{var"#s91"} where var"#s91"<:Integer) in Base at range.jl:695
[12] getindex(r::AbstractUnitRange, s::StepRange{var"#s91",S} where S where var"#s91"<:Integer) in Base at range.jl:709
[13] getindex(A::SparseArrays.AbstractSparseMatrixCSC{Tv,Ti} where Ti<:Integer, I::AbstractUnitRange) where Tv in SparseArrays at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\SparseArrays\src\sparsevector.jl:626
[14] getindex(x::SparseArrays.AbstractSparseMatrixCSC, I::AbstractUnitRange, j::Integer) in SparseArrays at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\SparseArrays\src\sparsevector.jl:531
[15] getindex(x::SparseArrays.AbstractSparseArray{Tv,Ti,1}, I::AbstractUnitRange) where {Tv, Ti} in SparseArrays at C:\Users\kittisopikulm\AppData\Local\Programs\Julia 1.5.0\share\julia\stdlib\v1.5\SparseArrays\src\sparsevector.jl:784
[16] isempty(r::AbstractUnitRange) in Base at range.jl:503
[17] issorted(r::AbstractUnitRange) in Base at range.jl:1011
[18] length(r::AbstractUnitRange) in Base at range.jl:545
[19] maximum(r::AbstractUnitRange) in Base at range.jl:602
[20] minimum(r::AbstractUnitRange) in Base at range.jl:601
[21] sort(r::AbstractUnitRange) in Base at range.jl:1014
[22] sort!(r::AbstractUnitRange) in Base at range.jl:1015
[23] sortperm(r::AbstractUnitRange) in Base at range.jl:1019
[24] step(r::AbstractUnitRange{T}) where T in Base at range.jl:528
[25] view(r1::AbstractUnitRange, r2::AbstractUnitRange{var"#s91"} where var"#s91"<:Integer) in Base at subarray.jl:167
[26] view(r1::AbstractUnitRange, r2::StepRange{var"#s91",S} where S where var"#s91"<:Integer) in Base at subarray.jl:171

@mbauman
Copy link
Member

mbauman commented Jan 7, 2021

In my mind this is covered by the very first line of documentation https://github.com/JuliaLang/julia/pull/38041/files#diff-8e63081442c6b1785e18b93e97805e8191e62860d828af6d50b530c90b72a847R54:

Construct a specialized array with evenly spaced elements and optimized storage (an AbstractRange) from the arguments.

@mkitti
Copy link
Contributor

mkitti commented Jan 7, 2021

AbstractRange is more general than AbstractUnitRange or UnitRange. While I'm not sure if this should be the first detail, it should be documented somewhere. Here's a common case:

julia> x = Base.OneTo(1_000_000) # Proposed as `range(1_000_000)`
Base.OneTo(1000000)

julia> y = range(1; stop = 1_000_000)
1:1000000

julia> z = range(1, 1_000_000; length=1_000_000)
1.0:1.0:1.0e6

julia> f = ==(100_000)
(::Base.Fix2{typeof(==),Int64}) (generic function with 1 method)

julia> @btime findfirst($f, $x)
  0.001 ns (0 allocations: 0 bytes)
100000

julia> @btime findfirst($f, $y)
  0.001 ns (0 allocations: 0 bytes)
100000

julia> @btime findfirst($f, $z)
  249.899 μs (0 allocations: 0 bytes)
100000

julia> isa.( (x,y,z), AbstractUnitRange )
(true, true, false)

Specifying length is costly.

@mbauman
Copy link
Member

mbauman commented Jan 7, 2021

I'll just note that you didn't use the range function there. :) I just don't think it needs to be documented here — you end up getting some object back and you can ask for its help if you are curious about it.

@mkitti
Copy link
Contributor

mkitti commented Jan 8, 2021

I'll just note that you didn't use the range function there. :) I just don't think it needs to be documented here — you end up getting some object back and you can ask for its help if you are curious about it.

I revised my post. Specifying length and stop at the same time is costly.

The scenario I have in mind is that the user has figured out what a UnitRange is and wants to know if this syntax will create one and under what circumstances. I think the answer is do not specify step or force it to be calculated.

Anyways, we figured it out and wrote the documentation up in #37875 and put it the Extended Help section here:

julia/base/range.jl

Lines 158 to 167 in 20b84f4

## `UnitRange` construction
To specify a [`UnitRange`](@ref) where `step` is 1, use one of the following
where `start`, `length`, and `stop` are all integers.
* range(start, length=length)
* range(start, stop=stop)
* `start:stop`
* `(:)(start,stop)`
Specifying a `step` of 1 explicitly, does not result in a [`UnitRange`](@ref).

Maybe the documentation actually belongs under UnitRange. I suppose we'll revisit the doc PR after this is merged.

@jw3126
Copy link
Contributor Author

jw3126 commented Jan 8, 2021

In my mind this is covered by the very first line of documentation https://github.com/JuliaLang/julia/pull/38041/files#diff-8e63081442c6b1785e18b93e97805e8191e62860d828af6d50b530c90b72a847R54:

Construct a specialized array with evenly spaced elements and optimized storage (an AbstractRange) from the arguments.

This line gives a hint, but is not as precise. Is returning a UnitRange more of an implementation detail or guaranteed semantics? It is unit tested and IMO it should be guaranteed, so documenting it is appropriate I think.

@mkitti
Copy link
Contributor

mkitti commented Jan 8, 2021

The only place I see UnitRange unit tests from constructed ranges is for (:).

julia/test/ranges.jl

Lines 991 to 992 in 65898ed

@test convert(UnitRange{Int}, 0:5) === 0:5
@test convert(UnitRange{Int128}, 0:5) === Int128(0):Int128(5)

My suggestion is that we use === to compare to similar (:), colon, based syntax in situations where we expect them to perform similarly. Comparable output to (:) for similar input is a fairly minimal contract for this method.

        r = 4:9
        @test range(start=first(r),               stop=last(r)                  ) === r
        @test range(start=first(r),                             length=length(r)) === r
        @test range(start=first(r),               stop=last(r), length=length(r))  == r # Does not produce a `UnitRange`
        @test range(                              stop=last(r), length=length(r)) === r

test/ranges.jl Show resolved Hide resolved
@jw3126
Copy link
Contributor Author

jw3126 commented Jan 11, 2021

Can this be merged now?

@mbauman mbauman merged commit 97bd48c into JuliaLang:master Jan 12, 2021
@jw3126
Copy link
Contributor Author

jw3126 commented Jan 12, 2021

Great, that this one is finally merged 🥳

@mbauman
Copy link
Member

mbauman commented Jan 12, 2021

Thanks for your persistence and patience!

@mkitti
Copy link
Contributor

mkitti commented Jan 13, 2021

I submitted #39223 as a pull request to implement start=1 when start is not provided as an argument.

ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
…docs and error messages (JuliaLang#38041)

Mathematically a range is uniquely determined by three out of four of start, step, stop, length.
Furthermore if one assumes step=1 any combination of two others macthematically suffices to specify a range.

With this PR the range function reflects this. Any combination of three (two non step) arguments will be accepted.

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Co-authored-by: Johnny Chen <johnnychen94@hotmail.com>
Co-authored-by: Mark Kittisopikul <mkitti@users.noreply.github.com>
Co-authored-by: Matt Bauman <mbauman@juliacomputing.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.